Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AsmPrinter][DebugNames] Implement DW_IDX_parent entries #77457

Merged
merged 6 commits into from
Jan 19, 2024

Conversation

felipepiovezan
Copy link
Contributor

@felipepiovezan felipepiovezan commented Jan 9, 2024

This implements the ideas discussed in 1.

To summarize, this commit changes AsmPrinter so that it outputs DW_IDX_parent information for debug_name entries. It will enable debuggers to speed up queries for fully qualified types (based on a DWARFDeclContext) significantly, as debuggers will no longer need to parse the entire CU in order to inspect the parent chain of a DIE. Instead, a debugger can simply take the parent DIE offset from the accelerator table and peek at its name in the debug_info/debug_str sections.

The implementation uses two types of DW_FORM for the DW_IDX_parent attribute:

  1. DW_FORM_ref4, which points to the accelerator table entry for the parent.
  2. DW_FORM_flag_present, when the entry has a parent that is not in the table (that is, the parent doesn't have a name, or isn't allowed to be in the table as per the DWARF spec). This is space-efficient, since it takes 0 bytes.

The implementation works by:

  1. Changing how abbreviations are encoded (so that they encode which form, if
    any, was used to encode IDX_Parent)
  2. Creating an MCLabel per accelerator table entry, so that they may be
    referred by IDX_parent references.

When all patches related to this are merged, we are able to show that evaluating an expression such as:

lldb --batch -o 'b CodeGenFunction::GenerateCode' -o run -o 'expr Fn' -- \
  clang++ -c -g test.cpp -o /dev/null

is far faster: from ~5000 ms to ~1500ms.

Building llvm-project + clang with and without this patch, and looking at its impact on object file size:

ls -la $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,507,327,592

-la $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,436,446,616

That is, an increase of 0.62% in total object file size.

Looking only at debug_names:

$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
440,772,348

$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
369,867,920

That is an increase of 19%.

DWARF Linkers need to be changed in order to support this. This commit already brings support to "base" linker, but it does not attempt to modify the parallel linker. Accelerator entries refer to the corresponding DIE offset, and this patch also requires the parent DIE offset -- it's not clear how the parallel linker can access this. It may be obvious to someone familiar with it, but it would be nice to get help from its authors.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 9, 2024

@llvm/pr-subscribers-debuginfo

Author: Felipe de Azevedo Piovezan (felipepiovezan)

Changes

This implements the ideas discussed in 1.

To summarize, this commit changes AsmPrinter so that it outputs DW_IDX_parent information for debug_name entries. It will enable debuggers to speed up queries for fully qualified types (based on a DWARFDeclContext) significantly, as debuggers will no longer need to parse the entire CU in order to inspect the parent chain of a DIE. Instead, a debugger can simply take the parent DIE offset from the accelerator table and peek at its name in the debug_info/debug_str sections.

The implementation uses two types of DW_FORM for the DW_IDX_parent attribute:

  1. DW_FORM_ref4, which points to the accelerator table entry for the parent.
  2. DW_FORM_flag_present, when the entry has a parent that is not in the table (that is, the parent doesn't have a name, or isn't allowed to be in the table as per the DWARF spec). This is space-efficient, since it takes 0 bytes.

When all patches related to this are merged, we are able to show that evaluating an expression such as:

lldb --batch -o 'b CodeGenFunction::GenerateCode' -o run -o 'expr Fn' -- \
  clang++ -c -g test.cpp -o /dev/null

is far faster: from ~5000 ms to ~1500ms.

Building llvm-project + clang with and without this patch, and looking at its impact on object file size:

ls -la $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,507,327,592

-la $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,436,446,616

That is, an increase of 0.62% in total object file size.

Looking only at debug_names:

$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
440,772,348

$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
369,867,920

That is an increase of 19%.

DWARF Linkers need to be changed in order to support this. This commit already brings support to "base" linker, but it does not attempt to modify the parallel linker. Accelerator entries refer to the corresponding DIE offset, and this patch also requires the parent DIE offset -- it's not clear how the parallel linker can access this. It may be obvious to someone familiar with it, but it would be nice to get help from its authors.


Patch is 28.82 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77457.diff

10 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/AccelTable.h (+18-5)
  • (modified) llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp (+104-14)
  • (modified) llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp (+8)
  • (modified) llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.cpp (+2-1)
  • (modified) llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp (+15-1)
  • (modified) llvm/test/DebugInfo/X86/debug-names-dwarf64.ll (+12-4)
  • (modified) llvm/test/DebugInfo/X86/debug-names-end-of-list.ll (+4-2)
  • (modified) llvm/test/DebugInfo/X86/debug-names-types.ll (+39-19)
  • (modified) llvm/test/tools/dsymutil/ARM/accel-imported-declarations.test (+2)
  • (modified) llvm/test/tools/dsymutil/ARM/dwarf5-dwarf4-combination-macho.test (+8-4)
diff --git a/llvm/include/llvm/CodeGen/AccelTable.h b/llvm/include/llvm/CodeGen/AccelTable.h
index 0638fbffda4f31..12ed123ee90a02 100644
--- a/llvm/include/llvm/CodeGen/AccelTable.h
+++ b/llvm/include/llvm/CodeGen/AccelTable.h
@@ -270,9 +270,12 @@ class DWARF5AccelTableData : public AccelTableData {
 
   DWARF5AccelTableData(const DIE &Die, const uint32_t UnitID,
                        const bool IsTU = false);
-  DWARF5AccelTableData(const uint64_t DieOffset, const unsigned DieTag,
-                       const unsigned UnitID, const bool IsTU = false)
-      : OffsetVal(DieOffset), DieTag(DieTag), UnitID(UnitID), IsTU(IsTU) {}
+  DWARF5AccelTableData(const uint64_t DieOffset,
+                       const std::optional<uint64_t> ParentOffset,
+                       const unsigned DieTag, const unsigned UnitID,
+                       const bool IsTU = false)
+      : OffsetVal(DieOffset), ParentOffset(ParentOffset), DieTag(DieTag),
+        UnitID(UnitID), IsTU(IsTU) {}
 
 #ifndef NDEBUG
   void print(raw_ostream &OS) const override;
@@ -287,14 +290,23 @@ class DWARF5AccelTableData : public AccelTableData {
   bool isTU() const { return IsTU; }
   void normalizeDIEToOffset() {
     assert(!isNormalized() && "Accessing offset after normalizing.");
-    OffsetVal = std::get<const DIE *>(OffsetVal)->getOffset();
+    const DIE *Entry = std::get<const DIE *>(OffsetVal);
+    ParentOffset = Entry->getParent() ? Entry->getParent()->getOffset()
+                                      : std::optional<uint64_t>();
+    OffsetVal = Entry->getOffset();
   }
   bool isNormalized() const {
     return std::holds_alternative<uint64_t>(OffsetVal);
   }
 
+  std::optional<uint64_t> getParentDieOffset() const {
+    assert(isNormalized() && "Accessing DIE Offset before normalizing.");
+    return ParentOffset;
+  }
+
 protected:
   std::variant<const DIE *, uint64_t> OffsetVal;
+  std::optional<uint64_t> ParentOffset;
   uint32_t DieTag : 16;
   uint32_t UnitID : 15;
   uint32_t IsTU : 1;
@@ -341,7 +353,8 @@ class DWARF5AccelTable : public AccelTable<DWARF5AccelTableData> {
   void addTypeEntries(DWARF5AccelTable &Table) {
     for (auto &Entry : Table.getEntries()) {
       for (auto *Data : Entry.second.getValues<DWARF5AccelTableData *>()) {
-        addName(Entry.second.Name, Data->getDieOffset(), Data->getDieTag(),
+        addName(Entry.second.Name, Data->getDieOffset(),
+                Data->getParentDieOffset(), Data->getDieTag(),
                 Data->getUnitID(), true);
       }
     }
diff --git a/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp b/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
index b72c17aa6f54a3..36ffa872c024e8 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
@@ -13,6 +13,7 @@
 #include "llvm/CodeGen/AccelTable.h"
 #include "DwarfCompileUnit.h"
 #include "DwarfUnit.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/Twine.h"
 #include "llvm/BinaryFormat/Dwarf.h"
@@ -207,7 +208,7 @@ class Dwarf5AccelTableWriter : public AccelTableWriter {
   };
 
   Header Header;
-  DenseMap<uint32_t, SmallVector<DWARF5AccelTableData::AttributeEncoding, 2>>
+  DenseMap<uint32_t, SmallVector<DWARF5AccelTableData::AttributeEncoding, 3>>
       Abbreviations;
   ArrayRef<std::variant<MCSymbol *, uint64_t>> CompUnits;
   ArrayRef<std::variant<MCSymbol *, uint64_t>> TypeUnits;
@@ -220,6 +221,8 @@ class Dwarf5AccelTableWriter : public AccelTableWriter {
   MCSymbol *EntryPool = Asm->createTempSymbol("names_entries");
   // Indicates if this module is built with Split Dwarf enabled.
   bool IsSplitDwarf = false;
+  // Stores the DIE offsets which are indexed by this table.
+  DenseSet<uint64_t> IndexedOffsets;
 
   void populateAbbrevsMap();
 
@@ -228,8 +231,11 @@ class Dwarf5AccelTableWriter : public AccelTableWriter {
   void emitBuckets() const;
   void emitStringOffsets() const;
   void emitAbbrevs() const;
-  void emitEntry(const DWARF5AccelTableData &Entry) const;
-  void emitData() const;
+  void
+  emitEntry(const DWARF5AccelTableData &Entry,
+            const DenseMap<uint64_t, MCSymbol *> &DIEOffsetToAccelEntryLabel,
+            DenseSet<MCSymbol *> &EmittedAccelEntrySymbols) const;
+  void emitData();
 
 public:
   Dwarf5AccelTableWriter(
@@ -395,23 +401,65 @@ void Dwarf5AccelTableWriter::Header::emit(Dwarf5AccelTableWriter &Ctx) {
   Asm->OutStreamer->emitBytes({AugmentationString, AugmentationStringSize});
 }
 
-static uint32_t constexpr LowerBitSize = dwarf::DW_IDX_type_hash;
+enum IdxParentEncoding : uint8_t {
+  NoIndexedParent = 0, // Parent information present but parent isn't indexed.
+  Ref4 = 1,            // Parent information present and parent is indexed.
+  NoParent = 2,        // Parent information missing.
+};
+
+static uint32_t constexpr NumBitsIdxParent = 2;
+
+uint8_t encodeIdxParent(std::optional<dwarf::Form> MaybeParentForm) {
+  if (!MaybeParentForm)
+    return NoParent;
+  switch (*MaybeParentForm) {
+  case dwarf::Form::DW_FORM_flag_present:
+    return NoIndexedParent;
+  case dwarf::Form::DW_FORM_ref4:
+    return Ref4;
+  default:
+    break;
+  }
+  // This is not crashing on bad input: we should only reach this if the
+  // internal compiler logic is faulty; see getFormForIdxParent.
+  llvm_unreachable("Bad form for IDX_parent");
+}
+
+static uint32_t constexpr ParentBitOffset = dwarf::DW_IDX_type_hash;
+static uint32_t constexpr TagBitOffset = ParentBitOffset + NumBitsIdxParent;
 static uint32_t getTagFromAbbreviationTag(const uint32_t AbbrvTag) {
-  return AbbrvTag >> LowerBitSize;
+  return AbbrvTag >> TagBitOffset;
 }
 
 /// Constructs a unique AbbrevTag that captures what a DIE accesses.
 /// Using this tag we can emit a unique abbreviation for each DIE.
 static uint32_t constructAbbreviationTag(
     const unsigned Tag,
-    const std::optional<DWARF5AccelTable::UnitIndexAndEncoding> &EntryRet) {
+    const std::optional<DWARF5AccelTable::UnitIndexAndEncoding> &EntryRet,
+    std::optional<dwarf::Form> MaybeParentForm) {
   uint32_t AbbrvTag = 0;
   if (EntryRet)
     AbbrvTag |= 1 << EntryRet->Encoding.Index;
   AbbrvTag |= 1 << dwarf::DW_IDX_die_offset;
-  AbbrvTag |= Tag << LowerBitSize;
+  AbbrvTag |= 1 << dwarf::DW_IDX_parent;
+  AbbrvTag |= encodeIdxParent(MaybeParentForm) << ParentBitOffset;
+  AbbrvTag |= Tag << TagBitOffset;
   return AbbrvTag;
 }
+
+static std::optional<dwarf::Form>
+getFormForIdxParent(const DenseSet<uint64_t> &IndexedOffsets,
+                    std::optional<uint64_t> ParentOffset) {
+  // No parent information
+  if (!ParentOffset)
+    return std::nullopt;
+  // Parent is indexed by this table.
+  if (IndexedOffsets.contains(*ParentOffset))
+    return dwarf::Form::DW_FORM_ref4;
+  // Parent is not indexed by this table.
+  return dwarf::Form::DW_FORM_flag_present;
+}
+
 void Dwarf5AccelTableWriter::populateAbbrevsMap() {
   for (auto &Bucket : Contents.getBuckets()) {
     for (auto *Hash : Bucket) {
@@ -419,12 +467,17 @@ void Dwarf5AccelTableWriter::populateAbbrevsMap() {
         std::optional<DWARF5AccelTable::UnitIndexAndEncoding> EntryRet =
             getIndexForEntry(*Value);
         unsigned Tag = Value->getDieTag();
-        uint32_t AbbrvTag = constructAbbreviationTag(Tag, EntryRet);
+        std::optional<dwarf::Form> MaybeParentForm =
+            getFormForIdxParent(IndexedOffsets, Value->getParentDieOffset());
+        uint32_t AbbrvTag =
+            constructAbbreviationTag(Tag, EntryRet, MaybeParentForm);
         if (Abbreviations.count(AbbrvTag) == 0) {
-          SmallVector<DWARF5AccelTableData::AttributeEncoding, 2> UA;
+          SmallVector<DWARF5AccelTableData::AttributeEncoding, 3> UA;
           if (EntryRet)
             UA.push_back(EntryRet->Encoding);
           UA.push_back({dwarf::DW_IDX_die_offset, dwarf::DW_FORM_ref4});
+          if (MaybeParentForm)
+            UA.push_back({dwarf::DW_IDX_parent, *MaybeParentForm});
           Abbreviations.try_emplace(AbbrvTag, UA);
         }
       }
@@ -496,15 +549,32 @@ void Dwarf5AccelTableWriter::emitAbbrevs() const {
 }
 
 void Dwarf5AccelTableWriter::emitEntry(
-    const DWARF5AccelTableData &Entry) const {
+    const DWARF5AccelTableData &Entry,
+    const DenseMap<uint64_t, MCSymbol *> &DIEOffsetToAccelEntryLabel,
+    DenseSet<MCSymbol *> &EmittedAccelEntrySymbols) const {
   std::optional<DWARF5AccelTable::UnitIndexAndEncoding> EntryRet =
       getIndexForEntry(Entry);
-  uint32_t AbbrvTag = constructAbbreviationTag(Entry.getDieTag(), EntryRet);
+  std::optional<uint64_t> MaybeParentOffset = Entry.getParentDieOffset();
+  std::optional<dwarf::Form> MaybeParentForm =
+      getFormForIdxParent(IndexedOffsets, MaybeParentOffset);
+  uint32_t AbbrvTag =
+      constructAbbreviationTag(Entry.getDieTag(), EntryRet, MaybeParentForm);
   auto AbbrevIt = Abbreviations.find(AbbrvTag);
   assert(AbbrevIt != Abbreviations.end() &&
          "Why wasn't this abbrev generated?");
   assert(getTagFromAbbreviationTag(AbbrevIt->first) == Entry.getDieTag() &&
          "Invalid Tag");
+
+  auto EntrySymbolIt = DIEOffsetToAccelEntryLabel.find(Entry.getDieOffset());
+  assert(EntrySymbolIt != DIEOffsetToAccelEntryLabel.end());
+  MCSymbol *EntrySymbol = EntrySymbolIt->getSecond();
+
+  // Emit the label for this Entry, so that IDX_parents may refer to it.
+  // Note: a DIE may have multiple accelerator Entries; this check avoids
+  // creating/emitting multiple labels for the same DIE.
+  if (EmittedAccelEntrySymbols.insert(EntrySymbol).second)
+    Asm->OutStreamer->emitLabel(EntrySymbol);
+
   Asm->emitULEB128(AbbrevIt->first, "Abbreviation code");
 
   for (const auto &AttrEnc : AbbrevIt->second) {
@@ -520,20 +590,34 @@ void Dwarf5AccelTableWriter::emitEntry(
       assert(AttrEnc.Form == dwarf::DW_FORM_ref4);
       Asm->emitInt32(Entry.getDieOffset());
       break;
+    case dwarf::DW_IDX_parent: {
+      if (AttrEnc.Form == dwarf::Form::DW_FORM_flag_present)
+        break;
+      auto ParentSymbolIt = DIEOffsetToAccelEntryLabel.find(*MaybeParentOffset);
+      assert(ParentSymbolIt != DIEOffsetToAccelEntryLabel.end());
+      Asm->emitLabelDifference(ParentSymbolIt->getSecond(), EntryPool, 4);
+      break;
+    }
     default:
       llvm_unreachable("Unexpected index attribute!");
     }
   }
 }
 
-void Dwarf5AccelTableWriter::emitData() const {
+void Dwarf5AccelTableWriter::emitData() {
+  DenseMap<uint64_t, MCSymbol*> DIEOffsetToAccelEntryLabel;
+
+  for (auto Offset : IndexedOffsets)
+    DIEOffsetToAccelEntryLabel[Offset] = Asm->createTempSymbol("symbol");
+
   Asm->OutStreamer->emitLabel(EntryPool);
+  DenseSet<MCSymbol *> EmittedAccelEntrySymbols;
   for (auto &Bucket : Contents.getBuckets()) {
     for (auto *Hash : Bucket) {
       // Remember to emit the label for our offset.
       Asm->OutStreamer->emitLabel(Hash->Sym);
-      for (const auto *Value : Hash->Values)
-        emitEntry(*static_cast<const DWARF5AccelTableData *>(Value));
+      for (const auto *Value : Hash->getValues<DWARF5AccelTableData*>())
+        emitEntry(*Value, DIEOffsetToAccelEntryLabel, EmittedAccelEntrySymbols);
       Asm->OutStreamer->AddComment("End of list: " + Hash->Name.getString());
       Asm->emitInt8(0);
     }
@@ -555,6 +639,12 @@ Dwarf5AccelTableWriter::Dwarf5AccelTableWriter(
       CompUnits(CompUnits), TypeUnits(TypeUnits),
       getIndexForEntry(std::move(getIndexForEntry)),
       IsSplitDwarf(IsSplitDwarf) {
+
+  for (auto &Bucket : Contents.getBuckets())
+    for (auto *Hash : Bucket)
+      for (auto *Value : Hash->getValues<DWARF5AccelTableData *>())
+        IndexedOffsets.insert(Value->getDieOffset());
+
   populateAbbrevsMap();
 }
 
diff --git a/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp b/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
index 8d76c3bcf672e7..66d9b0ae9841fe 100644
--- a/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
+++ b/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
@@ -2252,14 +2252,22 @@ void DWARFLinker::emitAcceleratorEntriesForUnit(CompileUnit &Unit) {
       TheDwarfEmitter->emitPubTypesForUnit(Unit);
     } break;
     case AccelTableKind::DebugNames: {
+      auto ParentOffsetOrNull = [](const DIE *Die) -> std::optional<uint64_t> {
+        if (const DIE *Parent = Die->getParent())
+          return Die->getParent()->getOffset();
+        return std::nullopt;
+      };
       for (const auto &Namespace : Unit.getNamespaces())
         DebugNames.addName(Namespace.Name, Namespace.Die->getOffset(),
+                           ParentOffsetOrNull(Namespace.Die),
                            Namespace.Die->getTag(), Unit.getUniqueID());
       for (const auto &Pubname : Unit.getPubnames())
         DebugNames.addName(Pubname.Name, Pubname.Die->getOffset(),
+                           ParentOffsetOrNull(Pubname.Die),
                            Pubname.Die->getTag(), Unit.getUniqueID());
       for (const auto &Pubtype : Unit.getPubtypes())
         DebugNames.addName(Pubtype.Name, Pubtype.Die->getOffset(),
+                           ParentOffsetOrNull(Pubtype.Die),
                            Pubtype.Die->getTag(), Unit.getUniqueID());
     } break;
     }
diff --git a/llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.cpp b/llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.cpp
index bb59cbfdb3477a..790229e213b2bd 100644
--- a/llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.cpp
+++ b/llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.cpp
@@ -1374,7 +1374,8 @@ void DWARFLinkerImpl::emitDWARFv5DebugNamesSection(const Triple &TargetTriple) {
       case DwarfUnit::AccelType::Namespace:
       case DwarfUnit::AccelType::Type: {
         DebugNames->addName(*DebugStrStrings.getExistingEntry(Info.String),
-                            Info.OutOffset, Info.Tag, CU->getUniqueID());
+                            Info.OutOffset, std::nullopt /*ParentDIEOffset*/,
+                            Info.Tag, CU->getUniqueID());
       } break;
 
       default:
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp b/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
index d25b732fdba3f0..c4c14f5e2c9d36 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 #include "llvm/DebugInfo/DWARF/DWARFVerifier.h"
 #include "llvm/ADT/IntervalMap.h"
+#include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/BinaryFormat/Dwarf.h"
 #include "llvm/DebugInfo/DWARF/DWARFAbbreviationDeclaration.h"
@@ -1272,6 +1273,20 @@ unsigned DWARFVerifier::verifyNameIndexAttribute(
           NI.getUnitOffset(), Abbr.Code, AttrEnc.Form, dwarf::DW_FORM_data8);
       return 1;
     }
+    return 0;
+  }
+
+  if (AttrEnc.Index == dwarf::DW_IDX_parent) {
+    constexpr static auto AllowedForms = {dwarf::Form::DW_FORM_flag_present,
+                                          dwarf::Form::DW_FORM_ref4};
+    if (!is_contained(AllowedForms, AttrEnc.Form)) {
+      error() << formatv("NameIndex @ {0:x}: Abbreviation {1:x}: DW_IDX_parent "
+                         "uses an unexpected form {2} (should be "
+                         "DW_FORM_ref4 or DW_FORM_flag_present).\n",
+                         NI.getUnitOffset(), Abbr.Code, AttrEnc.Form);
+      return 1;
+    }
+    return 0;
   }
 
   // A list of known index attributes and their expected form classes.
@@ -1286,7 +1301,6 @@ unsigned DWARFVerifier::verifyNameIndexAttribute(
       {dwarf::DW_IDX_compile_unit, DWARFFormValue::FC_Constant, {"constant"}},
       {dwarf::DW_IDX_type_unit, DWARFFormValue::FC_Constant, {"constant"}},
       {dwarf::DW_IDX_die_offset, DWARFFormValue::FC_Reference, {"reference"}},
-      {dwarf::DW_IDX_parent, DWARFFormValue::FC_Constant, {"constant"}},
   };
 
   ArrayRef<FormClassTable> TableRef(Table);
diff --git a/llvm/test/DebugInfo/X86/debug-names-dwarf64.ll b/llvm/test/DebugInfo/X86/debug-names-dwarf64.ll
index b94bdbcb12010c..a6855d77a34ac2 100644
--- a/llvm/test/DebugInfo/X86/debug-names-dwarf64.ll
+++ b/llvm/test/DebugInfo/X86/debug-names-dwarf64.ll
@@ -30,21 +30,25 @@
 ; CHECK-NEXT:     CU[0]: 0x00000000
 ; CHECK-NEXT:   ]
 ; CHECK-NEXT:   Abbreviations [
+; CHECK-NEXT:     Abbreviation [[ABBREV_LABEL:0x[0-9a-f]*]] {
+; CHECK-NEXT:       Tag: DW_TAG_label
+; CHECK-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-NEXT:       DW_IDX_parent: DW_FORM_ref4
+; CHECK-NEXT:     }
 ; CHECK-NEXT:     Abbreviation [[ABBREV:0x[0-9a-f]*]] {
 ; CHECK-NEXT:       Tag: DW_TAG_base_type
 ; CHECK-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-NEXT:       DW_IDX_parent: DW_FORM_flag_present
 ; CHECK-NEXT:     }
 ; CHECK-NEXT:     Abbreviation [[ABBREV1:0x[0-9a-f]*]] {
 ; CHECK-NEXT:       Tag: DW_TAG_variable
 ; CHECK-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-NEXT:       DW_IDX_parent: DW_FORM_flag_present
 ; CHECK-NEXT:     }
 ; CHECK-NEXT:     Abbreviation [[ABBREV_SP:0x[0-9a-f]*]] {
 ; CHECK-NEXT:       Tag: DW_TAG_subprogram
 ; CHECK-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
-; CHECK-NEXT:     }
-; CHECK-NEXT:     Abbreviation [[ABBREV_LABEL:0x[0-9a-f]*]] {
-; CHECK-NEXT:       Tag: DW_TAG_label
-; CHECK-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-NEXT:       DW_IDX_parent: DW_FORM_flag_present
 ; CHECK-NEXT:     }
 ; CHECK-NEXT:   ]
 ; CHECK-NEXT:   Bucket 0 [
@@ -55,6 +59,7 @@
 ; CHECK-NEXT:         Abbrev: [[ABBREV]]
 ; CHECK-NEXT:         Tag: DW_TAG_base_type
 ; CHECK-NEXT:         DW_IDX_die_offset: [[TYPEDIE]]
+; CHECK-NEXT:         DW_IDX_parent: true
 ; CHECK-NEXT:       }
 ; CHECK-NEXT:     }
 ; CHECK-NEXT:   ]
@@ -66,6 +71,7 @@
 ; CHECK-NEXT:         Abbrev: [[ABBREV1]]
 ; CHECK-NEXT:         Tag: DW_TAG_variable
 ; CHECK-NEXT:         DW_IDX_die_offset: [[VARDIE]]
+; CHECK-NEXT:         DW_IDX_parent: true
 ; CHECK-NEXT:       }
 ; CHECK-NEXT:     }
 ; CHECK-NEXT:     Name 3 {
@@ -75,6 +81,7 @@
 ; CHECK-NEXT:         Abbrev: [[ABBREV_SP]]
 ; CHECK-NEXT:         Tag: DW_TAG_subprogram
 ; CHECK-NEXT:         DW_IDX_die_offset: [[SPDIE]]
+; CHECK-NEXT:         DW_IDX_parent: true
 ; CHECK-NEXT:       }
 ; CHECK-NEXT:     }
 ; CHECK-NEXT:   ]
@@ -89,6 +96,7 @@
 ; CHECK-NEXT:         Abbrev: [[ABBREV_LABEL]]
 ; CHECK-NEXT:         Tag: DW_TAG_label
 ; CHECK-NEXT:         DW_IDX_die_offset: [[LABELDIE]]
+; CHECK-NEXT:         DW_IDX_parent: 0x{{.*}}
 ; CHECK-NEXT:       }
 ; CHECK-NEXT:     }
 ; CHECK-NEXT:   ]
diff --git a/llvm/test/DebugInfo/X86/debug-names-end-of-list.ll b/llvm/test/DebugInfo/X86/debug-names-end-of-list.ll
index f57168a8eff386..b8d4046201c342 100644
--- a/llvm/test/DebugInfo/X86/debug-names-end-of-list.ll
+++ b/llvm/test/DebugInfo/X86/debug-names-end-of-list.ll
@@ -6,8 +6,10 @@
 
 ; CHECK:   .section .debug_names,"",@progbits
 ; CHECK: .Lnames_entries0:
-; CHECK:   .byte 0    # End of list: int
-; CHECK:   .byte 0    # End of list: foo
+; CHECK:   .byte 0
+; CHECK-NEXT:   # End of list: int
+; CHECK:   .byte 0
+; CHECK-NEXT:   # End of list: foo
 
 @foo = common dso_local global i32 0, align 4, !dbg !5
 
diff --git a/llvm/test/DebugInfo/X86/debug-names-types.ll b/llvm/test/DebugInfo/X86/debug-names-types.ll
index 501b7efd88eb9a..c44e82578f14a4 100644
--- a/llvm/test/DebugInfo/X86/debug-names-types.ll
+++ b/llvm/test/DebugInfo/X86/debug-names-types.ll
@@ -37,27 +37,32 @@
 ; CHECK-NEXT:        LocalTU[0]: 0x00000000
 ; CHECK-NEXT:      ]
 ; CHECK:        Abbreviations [
-; CHECK-NEXT:     Abbreviation [[ABBREV1:0x[0-9a-f]*]] {
+; CHECK-NEXT:     Abbreviation [[ABBREV3:0x[0-9a-f]*]] {
 ; CHECK-NEXT:       Tag: DW_TAG_structure_type
+; CHECK-NEXT:       DW_IDX_type_unit: DW_FORM_data1
 ; CHECK-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-NEXT:       DW_IDX_parent: DW_FORM_flag_present
 ; CHECK-NEXT:     }
-; CHECK-NEXT:     Abbreviation [[ABBREV3:0x[0-9a-f]*]] {
-; CHECK-NEXT:       Tag: DW_TAG_structure_type
+; CHECK-NEXT:     Abbreviation [[ABBREV4:0x[0-9a-f]*]] {
+; CHECK-NEXT:       Tag: DW_TAG_base_type
 ; CHECK-NEXT:       DW_IDX_type_unit: DW_FORM_data1
 ; CHECK-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-NEXT:       DW_IDX_parent: DW_FORM_flag_present
 ; CHECK-NEXT:     }
 ; CHECK-NEXT:     Abbreviation [[ABBREV:0x[0-9a-f]*]] {
 ; CHECK-NEXT:       Tag: DW_TAG_base_type
 ; CHECK-NEXT:       DW_IDX_die_offset: DW_FORM_ref4
+; CHECK-NEXT:       DW_IDX_parent: DW_FORM_flag_present
 ; CHECK-NEXT:     }
-; CHECK-NEXT:     Abbreviation [[ABBREV2:0x[0-9a-f]*]] {
-; CHECK-NEXT:       Tag...
[truncated]

This implements the ideas discussed in [1].

To summarize, this commit changes AsmPrinter so that it outputs DW_IDX_parent
information for debug_name entries. It will enable debuggers to speed up queries
for fully qualified types (based on a DWARFDeclContext) significantly, as
debuggers will no longer need to parse the entire CU in order to inspect the
parent chain of a DIE. Instead, a debugger can simply take the parent DIE offset
from the accelerator table and peek at its name in the debug_info/debug_str
sections.

The implementation uses two types of DW_FORM for the DW_IDX_parent attribute:

1. DW_FORM_ref4, which points to the accelerator table entry for the parent.
2. DW_FORM_flag_present, when the entry has a parent that is not in the table
(that is, the parent doesn't have a name, or isn't allowed to be in the table as
per the DWARF spec). This is space-efficient, since it takes 0 bytes.

The implementation works by:

1. Changing how abbreviations are encoded (so that they encode which form, if
any, was used to encode IDX_Parent)
2. Creating an MCLabel per accelerator table entry, so that they may be
referred by IDX_parent references.

When all patches related to this are merged, we are able to show that evaluating
an expression such as:

```
lldb --batch -o 'b CodeGenFunction::GenerateCode' -o run -o 'expr Fn' -- \
  clang++ -c -g test.cpp -o /dev/null
```

is far faster: from ~5000 ms to ~1500ms.

Building llvm-project + clang with and without this patch, and looking at its
impact on object file size:

```
ls -la $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,507,327,592

-la $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,436,446,616
```

That is, an increase of 0.62% in total object file size.

Looking only at debug_names:

```
$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
440,772,348

$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
369,867,920
```

That is an increase of 19%.

DWARF Linkers need to be changed in order to support this. This commit already
brings support to "base" linker, but it does not attempt to modify the parallel
linker. Accelerator entries refer to the corresponding DIE offset, and this
patch also requires the parent DIE offset -- it's not clear how the parallel
linker can access this. It may be obvious to someone familiar with it, but it
would be nice to get help from its authors.

[1]: https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/
@felipepiovezan
Copy link
Contributor Author

(Updated commit message)

@@ -395,36 +401,83 @@ void Dwarf5AccelTableWriter::Header::emit(Dwarf5AccelTableWriter &Ctx) {
Asm->OutStreamer->emitBytes({AugmentationString, AugmentationStringSize});
}

static uint32_t constexpr LowerBitSize = dwarf::DW_IDX_type_hash;
enum IdxParentEncoding : uint8_t {
NoIndexedParent = 0, // Parent information present but parent isn't indexed.
Copy link
Collaborator

@adrian-prantl adrian-prantl Jan 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

///<

Copy link
Collaborator

@adrian-prantl adrian-prantl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, couple of minor comments inline!

}
// This is not crashing on bad input: we should only reach this if the
// internal compiler logic is faulty; see getFormForIdxParent.
llvm_unreachable("Bad form for IDX_parent");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, could this be moved in the place of the break?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure if GCC would emit a warning or not about "not returning on all paths", I'll try it on compiler explorer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like GCC doesn't emit a warning in this case, so I will move the unreachable up

@@ -220,6 +221,8 @@ class Dwarf5AccelTableWriter : public AccelTableWriter {
MCSymbol *EntryPool = Asm->createTempSymbol("names_entries");
// Indicates if this module is built with Split Dwarf enabled.
bool IsSplitDwarf = false;
// Stores the DIE offsets which are indexed by this table.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

///

@@ -1374,7 +1374,8 @@ void DWARFLinkerImpl::emitDWARFv5DebugNamesSection(const Triple &TargetTriple) {
case DwarfUnit::AccelType::Namespace:
case DwarfUnit::AccelType::Type: {
DebugNames->addName(*DebugStrStrings.getExistingEntry(Info.String),
Info.OutOffset, Info.Tag, CU->getUniqueID());
Info.OutOffset, std::nullopt /*ParentDIEOffset*/,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is what I was alluding to in the last paragraph of the commit message, it's not clear to me how/if the parallel linker has access to the parent DIE offset of the final, merged, debug information section. I was hoping to tackle this later

DenseMap<uint64_t, MCSymbol *> DIEOffsetToAccelEntryLabel;

for (auto Offset : IndexedOffsets)
DIEOffsetToAccelEntryLabel[Offset] = Asm->createTempSymbol("symbol");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would .insert({Offset, Asm->...}) also work here?

@adrian-prantl adrian-prantl self-requested a review January 9, 2024 18:58

static uint32_t constexpr NumBitsIdxParent = 2;

uint8_t encodeIdxParent(std::optional<dwarf::Form> MaybeParentForm) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const?

auto AbbrevIt = Abbreviations.find(AbbrvTag);
assert(AbbrevIt != Abbreviations.end() &&
"Why wasn't this abbrev generated?");
assert(getTagFromAbbreviationTag(AbbrevIt->first) == Entry.getDieTag() &&
"Invalid Tag");

auto EntrySymbolIt = DIEOffsetToAccelEntryLabel.find(Entry.getDieOffset());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With split dwarf we can have multiple DIEs with the same offset. Won't this potentially return the wrong label, or am I missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good question. Maybe the keys here need to be a pair of UnitID + DIEOffset?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should work

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually never mind. This will work for split-dwarf. Since there should be only one CU per module.
But I think collision is still possible with CU/TUs since offsets are relative to CU/TU, or with lto. I think multiple CUs can be in one module with that.
Yeah I think UnitID should be combined with DIEOffset.

BTW there is a code that suppose to unique things, but I don't think it's working right now:
#74391
Need to circle back to it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though IIUC, in AsmPrinter you'd have at most one .o and one .dwo file, so you may be able to get away with having just one extra bit for the disambiguation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will work on integrating UnitID next

Copy link
Contributor Author

@felipepiovezan felipepiovezan Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ayermolo @adrian-prantl can we assume the UnitID of the parent is always the same UnitID of the child?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that was a silly question, a parent is a parent in the DIE tree, so they are in the same Unit by definition.
Please let me know if you think this is wrong

@dwblaikie
Copy link
Collaborator

DW_FORM_flag_present, when the entry has a parent that is not in the table (that is, the parent doesn't have a name, or isn't allowed to be in the table as per the DWARF spec). This is space-efficient, since it takes 0 bytes.

I think the DWARF spec does allow for adding un-indexed entries to the index, and specifically for unnamed entries having an entry like "(anonymous namespace)" for instance (or other anonymous names) - maybe? Hmm, can't find the words I'm thinking of off-hand... but I thought they were there. Ah, here:

It is possible that an indexed debugging information entry has a parent that is not indexed (for example, if its parent does not have a name attribute). In such a case, a parent attribute may point to a nameless index entry (that is, one that cannot be reached from any entry in the name table), or it may point to the nearest ancestor that does have an index entry

Not suggesting we want to/need to do that, just clarifying that it's explicitly allowed by the standard - if that's useful.

Smaller encoding's are better, so I'm not exactly complaining about the use of present, but just in case that has unfortunate tradeoffs in some other way, there are other options.

@clayborg
Copy link
Collaborator

clayborg commented Jan 9, 2024

One question: This won't cause a forward declaration to show up in the .debug_names will it? I was thinking about this scenario, not sure if it can happen, but:

class A { // Forward decl only for 'A'
  class B { // full definition for class 'B'
  ...
  };
};

I want to make sure that if we enable this, it is fine to have an accelerator entry for class A, but it shouldn't be included in the name lookup if class A is just a forward declaration. We will need to point to an entry for class A for in the parent index in the entry for class B, but we shouldn't have an entry for class A that allows it to be found as a top level search as we don't want to try and lookup this class and find a class that is only a declaration.

@felipepiovezan
Copy link
Contributor Author

One question: This won't cause a forward declaration to show up in the .debug_names will it? I was thinking about this scenario, not sure if it can happen, but:

class A { // Forward decl only for 'A'
  class B { // full definition for class 'B'
  ...
  };
};

I want to make sure that if we enable this, it is fine to have an accelerator entry for class A, but it shouldn't be included in the name lookup if class A is just a forward declaration. We will need to point to an entry for class A for in the parent index in the entry for class B, but we shouldn't have an entry for class A that allows it to be found as a top level search as we don't want to try and lookup this class and find a class that is only a declaration.

This patch is not adding anything to the accelerator table that wasn't previously there, so it shouldn't be a problem.

it is fine to have an accelerator entry for class A, but it shouldn't be included in the name lookup if class A is just a forward declaration.

Forward declarations are never part of the accel table, the spec requires that:

All non-defining declarations (that is, debugging information entries with a DW_AT_declaration attribute) are excluded.

Is your concern that somehow some translation unit forward declares class A and then attempts to define a class B inside of A? There is no syntactical mechanism to do so in C++; note that, in your example, we are defining A. I believe it is literally impossible to type a program that does this in C++.

I'm not sure about other languages; based on the way the DWARF spec is written, it doesn't even acknowledge this possibility (because of the clause I quoted above). If other languages allow this type of thing, their compiler should probably not use IDX_Parents, because a parent that is only a forward declaration is forbidden from being inserted in the table, so the chain would be broken.

@felipepiovezan
Copy link
Contributor Author

I think the DWARF spec does allow for adding un-indexed entries to the index, and specifically for unnamed entries having an entry like "(anonymous namespace)" for instance (or other anonymous names) - maybe? H

One pedantic but important distinction here is that anonymous namespaces are not quite part of the "un-indexed" debate, as the spec treats them as if they had a name:

DW_TAG_namespace debugging information entries without a DW_AT_name attribute are included with the name “(anonymous namespace)”.

So they are indexed like any other named namespace.

It is possible that an indexed debugging information entry has a parent that is not indexed (for example, if its parent does not have a name attribute). In such a case, a parent attribute may point to a nameless index entry (that is, one that cannot be reached from any entry in the name table), or it may point to the nearest ancestor that does have an index entry

Not suggesting we want to/need to do that, just clarifying that it's explicitly allowed by the standard - if that's useful.

Smaller encoding's are better, so I'm not exactly complaining about the use of present, but just in case that has unfortunate tradeoffs in some other way, there are other options.

Yup, this is a great point! The original implementation was doing what the spec suggested (pointing to a "fake" entry), but honestly the code is a bit more complicated and uses a lot more space. For now, this seems to be a win-win situation, but if some other trade-off comes up, we can totally explore the other paths.

@dwblaikie
Copy link
Collaborator

One question: This won't cause a forward declaration to show up in the .debug_names will it? I was thinking about this scenario, not sure if it can happen, but:

class A { // Forward decl only for 'A'
  class B { // full definition for class 'B'
  ...
  };
};

I want to make sure that if we enable this, it is fine to have an accelerator entry for class A, but it shouldn't be included in the name lookup if class A is just a forward declaration. We will need to point to an entry for class A for in the parent index in the entry for class B, but we shouldn't have an entry for class A that allows it to be found as a top level search as we don't want to try and lookup this class and find a class that is only a declaration.

This patch is not adding anything to the accelerator table that wasn't previously there, so it shouldn't be a problem.

it is fine to have an accelerator entry for class A, but it shouldn't be included in the name lookup if class A is just a forward declaration.

Forward declarations are never part of the accel table, the spec requires that:

All non-defining declarations (that is, debugging information entries with a DW_AT_declaration attribute) are excluded.

Is your concern that somehow some translation unit forward declares class A and then attempts to define a class B inside of A? There is no syntactical mechanism to do so in C++; note that, in your example, we are defining A. I believe it is literally impossible to type a program that does this in C++.

I'm not sure about other languages; based on the way the DWARF spec is written, it doesn't even acknowledge this possibility (because of the clause I quoted above). If other languages allow this type of thing, their compiler should probably not use IDX_Parents, because a parent that is only a forward declaration is forbidden from being inserted in the table, so the chain would be broken.

C++ doesn't allow it, but clang does generate this for C++ when using type homing/-fno-standalone-debug. Try this:

struct A {
  A();
  struct B { };
};
A::B b;

compiled with -fno-standalone-debug (or on a non-mac platform, where that's the default): https://godbolt.org/z/rYvPxPMcY

@clayborg
Copy link
Collaborator

C++ doesn't allow it, but clang does generate this for C++ when using type homing/-fno-standalone-debug. Try this:

struct A {
  A();
  struct B { };
};
A::B b;

compiled with -fno-standalone-debug (or on a non-mac platform, where that's the default): https://godbolt.org/z/rYvPxPMcY

I was hoping to hear this response as I figured it must be, but wanted to make sure.

@felipepiovezan
Copy link
Contributor Author

felipepiovezan commented Jan 10, 2024

C++ doesn't allow it, but clang does generate this for C++ when using type homing/-fno-standalone-debug. Try this:

struct A {
  A();
  struct B { };
};
A::B b;

compiled with -fno-standalone-debug (or on a non-mac platform, where that's the default): https://godbolt.org/z/rYvPxPMcY

I was hoping to hear this response as I figured it must be, but wanted to make sure.

I wasn't aware of that flag, but we can probably fix it by saying: "If B's parent is a declaration, don't add B's parent offset" to the table, i.e., B won't have an IDX_Parent. This is needed so that we fallback to the mechanism of parsing the entire CU (this upcoming commit implements the query itself and the fallback mechanism)

Isn't this flag making it impossible to find A / destroying accelerator tables? Looking at the godbolt link, any queries for A in the accelerator table will return false, and debuggers will never find it, since accelerator tables are not allowed to produce false negatives (and they are not allowed to return declarations either)

@felipepiovezan
Copy link
Contributor Author

Pushed two fixups: one with the NFC changes requested, one changing IDX_Parent so that it is not included when the parent is a forward declaration (see the test added with that commit)

@felipepiovezan
Copy link
Contributor Author

Latest fixup addresses the UnitID issue

@dwblaikie
Copy link
Collaborator

C++ doesn't allow it, but clang does generate this for C++ when using type homing/-fno-standalone-debug. Try this:

struct A {
  A();
  struct B { };
};
A::B b;

compiled with -fno-standalone-debug (or on a non-mac platform, where that's the default): https://godbolt.org/z/rYvPxPMcY

I was hoping to hear this response as I figured it must be, but wanted to make sure.

I wasn't aware of that flag, but we can probably fix it by saying: "If B's parent is a declaration, don't add B's parent offset" to the table, i.e., B won't have an IDX_Parent. This is needed so that we fallback to the mechanism of parsing the entire CU (this upcoming commit implements the query itself and the fallback mechanism)

Isn't this flag making it impossible to find A / destroying accelerator tables? Looking at the godbolt link, any queries for A in the accelerator table will return false, and debuggers will never find it, since accelerator tables are not allowed to produce false negatives (and they are not allowed to return declarations either)

Given that .debug_names were derived from .apple_names which never had to deal with this situation, it's possible there's missing features/incompatibilities to deal with.

So, for now, a B entry could, yeah, be missing its parent info and the consumer could find B then check parse some DIEs to check the parentage. Though finding the parents of a given DIE might be expensive/require more parsing - perhaps we could put an unnamed entry for A to refer to as the parent? Then it'd be relatively quick to check only that DIE to see what its name is?

@felipepiovezan
Copy link
Contributor Author

So, for now, a B entry could, yeah, be missing its parent info and the consumer could find B then check parse some DIEs to check the parentage.

Yeah, that's what the implementation will do: just fall back to existing behavior. We won't regress existing behavior for the cases where no-standalone-debug causes debug information to be deleted.

perhaps we could put an unnamed entry for A to refer to as the parent? Then it'd be relatively quick to check only that DIE to see what its name is?

It can't be unnamed, because we need to check that the name is "A". But at the same time, it cannot be named because a debugger querying just "A" should never find it (it is a declaration). It really sounds like no-standalone-debug was not considered during standardization.

But even using no-standalone-debug, with the current version of the patch we will still improve queries for DIEs that were not optimized away, and not regress ones that were.

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for doing this @felipepiovezan. I left two code style suggestions inline but while they're common in LLVM they're really a matter of preference so feel free to ignore them if this is more consistent with the surrounding code.

Comment on lines 321 to 324
auto OffsetAndId = getParentDieOffsetAndUnitID();
if (!OffsetAndId)
return {};
return OffsetAndId->offset();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Save one line? ;-)

if (auto OffsetAndId = getParentDieOffsetAndUnitID()) 
  return OffsetAndId->offset();
return {};

Comment on lines 406 to 411
auto *Parent = Die.getParent();
if (!Parent)
return {};
if (Parent->findAttribute(dwarf::Attribute::DW_AT_declaration))
return {};
return Parent->getOffset();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not

if (auto *Parent = Die.getParent()) 
  if (Parent->findAttribute(dwarf::Attribute::DW_AT_declaration))
    return Parent->getOffset();
return {};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is flipped in the suggestion -- it should be !Parent->find... -- but it still looks better, so I will change it!

@ayermolo
Copy link
Contributor

Maybe an extra test to capture when DIEs have the same offsets?

@felipepiovezan
Copy link
Contributor Author

Maybe an extra test to capture when DIEs have the same offsets?

Do you have any suggestions on how to create this situation? It seems hard to control the offset of DIEs for tests that start from .ll inputs 🤔

@ayermolo
Copy link
Contributor

ayermolo commented Jan 12, 2024

Maybe an extra test to capture when DIEs have the same offsets?

Do you have any suggestions on how to create this situation? It seems hard to control the offset of DIEs for tests that start from .ll inputs 🤔

Two TUs with similar structure should have the same relative offsets.
Something like:
clang++ -g2 -gpubnames -fdebug-types-section main.cpp

struct Foo {
  char f1;
};
struct Foo2 {
  char f2;
};

Foo fooGlobal;
Foo2 foo2Global;
Name 4 {
      Hash: 0x7C952063
      String: 0x0000005d "char"
      Entry @ 0xcc {
        Abbrev: 0x48c
        Tag: DW_TAG_base_type
        DW_IDX_type_unit: 0x00
        DW_IDX_die_offset: 0x00000033
      }
      Entry @ 0xd3 {
        Abbrev: 0x48c
        Tag: DW_TAG_base_type
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000033
      }
    }

Edit: Although probably in different Name bucket. For when uniqueness code in AccelTableBase::finalize is fixed.

@felipepiovezan
Copy link
Contributor Author

felipepiovezan commented Jan 16, 2024

Edit: I think this bug is fixed in #77511 ?

..

Thanks for the example @ayermolo, this works quite will! I will update the PR shortly.

Related to your example -- but unrelated to this PR, as this also happens without any of the commits here --, I noticed that we add multiple accelerator entries for those types, entries with and without unit indices.
I.e. we add an entry pointing to the declaration present in the CU?
For example:

namespace MyNamespace {
struct MyStruct1 {
  char c1;
};
struct MyStruct2 {
  char c2;
};
}

MyNamespace::MyStruct1 gv1;
MyNamespace::MyStruct2 gv2;

Compiling with:

clang++ -target x86_64 -c two_tus.cpp -g2 -gpubnames -fdebug-types-section -o - | \
llvm-dwarfdump - --debug-names | \
grep MyStruct2 -A14

We see:

  Bucket 0 [
    Name 1 {
      Hash: 0xE186C022
      String: 0x00000037 "MyStruct2"
      Entry @ 0xc9 {
        Abbrev: 0x26c
        Tag: DW_TAG_structure_type
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000025
      }
      Entry @ 0xd0 {
        Abbrev: 0x268
        Tag: DW_TAG_structure_type
        DW_IDX_die_offset: 0x00000034
      }
    }
  ]

I can't explain why this second entry is added. Looking at the debug-info section, the offset 0x34 is an offset to the declaration inside the CU:

0x00000029:   DW_TAG_namespace
                DW_AT_name      ("MyNamespace")

0x0000002b:     DW_TAG_structure_type
                  DW_AT_declaration     (true)
                  DW_AT_signature       (0x8b108ad18e93cb95)

0x00000034:     DW_TAG_structure_type
                  DW_AT_declaration     (true)
                  DW_AT_signature       (0x690ca39e0eeaec8f)

0x0000003d:     NULL

But declarations are not supposed to show up in debug-names...

@felipepiovezan
Copy link
Contributor Author

added test as suggested by @ayermolo
I believe we're looking good to merge now?

@clayborg
Copy link
Collaborator

added test as suggested by @ayermolo I believe we're looking good to merge now?

The above forward declarations should not end up in the name lookup as that will violate the DWARF spec. It is ok for an entry to be in the list of entries, but any foward declarations should not be able to be looked up by name. It is ok for an entry to be in there for the parent index stuff, but we don't want to lookup MyStruct2 and find either entries from:

0x00000029:   DW_TAG_namespace
                DW_AT_name      ("MyNamespace")

0x0000002b:     DW_TAG_structure_type
                  DW_AT_declaration     (true)
                  DW_AT_signature       (0x8b108ad18e93cb95)

0x00000034:     DW_TAG_structure_type
                  DW_AT_declaration     (true)
                  DW_AT_signature       (0x690ca39e0eeaec8f)

0x0000003d:     NULL

@felipepiovezan
Copy link
Contributor Author

added test as suggested by @ayermolo I believe we're looking good to merge now?

The above forward declarations should not end up in the name lookup as that will violate the DWARF spec.

As I mentioned in the comment, this is not related to any of the IDX_Parent work; it happens in baseline LLVM as well.
I believe it is a bug that just got fixed in #77511

@ayermolo
Copy link
Contributor

added test as suggested by @ayermolo I believe we're looking good to merge now?

The above forward declarations should not end up in the name lookup as that will violate the DWARF spec.

As I mentioned in the comment, this is not related to any of the IDX_Parent work; it happens in baseline LLVM as well. I believe it is a bug that just got fixed in #77511

That fix was more about in which accelerator table entry would get added, which would lead to runtime assert.

@felipepiovezan
Copy link
Contributor Author

added test as suggested by @ayermolo I believe we're looking good to merge now?

The above forward declarations should not end up in the name lookup as that will violate the DWARF spec.

As I mentioned in the comment, this is not related to any of the IDX_Parent work; it happens in baseline LLVM as well. I believe it is a bug that just got fixed in #77511

That fix was more about in which accelerator table entry would get added, which would lead to runtime assert.

Interesting, I will open an issue so that we can investigate this separately then

@felipepiovezan
Copy link
Contributor Author

Created #78367

@ayermolo
Copy link
Contributor

added test as suggested by @ayermolo I believe we're looking good to merge now?

LGTM thanks!

@felipepiovezan
Copy link
Contributor Author

Addressed Jonas's suggestions

@felipepiovezan
Copy link
Contributor Author

@clayborg @dwblaikie any other concerns?

@felipepiovezan felipepiovezan merged commit b667783 into llvm:main Jan 19, 2024
4 checks passed
@felipepiovezan felipepiovezan deleted the felipe/idx_parent_entry branch January 19, 2024 17:19
Copy link
Collaborator

@dwblaikie dwblaikie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few minor bits of feedback (except the bit twiddling thing - that's one I think is especially worth some more attention/better ergonomics)

Comment on lines +260 to +267
struct OffsetAndUnitID : std::pair<uint64_t, uint32_t> {
using Base = std::pair<uint64_t, uint32_t>;
OffsetAndUnitID(Base B) : Base(B) {}

OffsetAndUnitID(uint64_t Offset, uint32_t UnitID) : Base(Offset, UnitID) {}
uint64_t offset() const { return first; };
uint32_t unitID() const { return second; };
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want a named structure ,perhaps we could avoid using a pair and accessors and have a plain struct with a public offset and unitID member?

std::optional<uint64_t>
DWARF5AccelTableData::getDefiningParentDieOffset(const DIE &Die) {
if (auto *Parent = Die.getParent();
Parent && !Parent->findAttribute(dwarf::Attribute::DW_AT_declaration))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's going to be the distinction between a parent that's a declaration, and a parent that doesn't have a name? Should there be a distinction between these cases?


static uint32_t constexpr NumBitsIdxParent = 2;

uint8_t encodeIdxParent(const std::optional<dwarf::Form> MaybeParentForm) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM doesn't usually use top-level const on values - I'd suggest removing it here (it can be confusing - makes people think maybe the & was missing).

NoParent = 2, /// Parent information missing.
};

static uint32_t constexpr NumBitsIdxParent = 2;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I /think/ we usually put constexpr before the type name?

Comment on lines +435 to 439
static uint32_t constexpr ParentBitOffset = dwarf::DW_IDX_type_hash;
static uint32_t constexpr TagBitOffset = ParentBitOffset + NumBitsIdxParent;
static uint32_t getTagFromAbbreviationTag(const uint32_t AbbrvTag) {
return AbbrvTag >> LowerBitSize;
return AbbrvTag >> TagBitOffset;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be nice to avoid stuffing more things into an int with bit fiddling - I tried to push back on this in the original review but we didn't manage to avoid it. Was hoping to get this cleaned up after-the-fact... (#70515, @ayermolo )

Like maybe a struct with a bitfield, at least?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be nice to avoid stuffing more things into an int with bit fiddling - I tried to push back on this in the original review but we didn't manage to avoid it. Was hoping to get this cleaned up after-the-fact... (#70515, @ayermolo )

Like maybe a struct with a bitfield, at least?

Hmm weird. Just got notification for this.
Yeah let me circle back to this after I put up PR for bolt initial implementation (without parent support). Will try a different approach there first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect David forgot to click the "submit review" button?

But yeah, more importantly, I think we need to make these abbreviation codes count from 1...num_abbreviations so that the ULEB encoding takes a single byte.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll incorporate it into my bolt implementation from start.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect David forgot to click the "submit review" button?

Yep, drowning in reviews/todos lately, so it was some late feedback I figured I'd send anyway,s ince I'd written it and some of it was still applicable.

But yeah, more importantly, I think we need to make these abbreviation codes count from 1...num_abbreviations so that the ULEB encoding takes a single byte.

Not sure I understand quite what you're saying here - but mostly it seemed like packing these things into the "AbbrvTag" is more a shortcut to being able to use a single int in the map/lookup/hash table, rather than the most readable piece of code & it'd be good to make the code more legible/self-descrpitive rather than having this bit fiddling.

Copy link
Contributor

@ayermolo ayermolo Jan 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Combining the two. Something like:

  union AbbrevDescriptor {
    struct {
      uint32_t CompUnit : 1;
      uint32_t TypeUnit : 1;
      uint32_t DieOffset : 1;
      uint32_t Parent : 1;
      uint32_t TypeHash : 1;
      uint32_t Tag : 27;
    } Bits;
    uint32_t Value = 0;
  };

DWARF5AcceleratorTable::TagIndex DWARF5AcceleratorTable::getTagFromAbbreviationTagAndIndex(const uint32_t AbbrvTag) const {
  auto Iter = AbbrevTagToIndexMap.find(AbbrvTag);
  assert(Iter != AbbrevTagToIndexMap.end() && "No index for an abbreviation tag");
  AbbrevDescriptor AbbrvDesc;
  AbbrvDesc.Value = AbbrvTag;
  return {(uint32_t)AbbrvDesc.Bits.Tag, Iter->second};
}
/// Constructs a unique AbbrevTag that captures what a DIE accesses.
/// Returns an index in the Abbreviation table.
uint32_t DWARF5AcceleratorTable::constructAbbreviationTag(
    const unsigned Tag,
    const std::optional<DWARF5AccelTable::UnitIndexAndEncoding> &EntryRet,
    const std::optional<DWARF5AccelTable::UnitIndexAndEncoding>
        &SecondEntryRet) {
  AbbrevDescriptor AbbrvDesc;
  auto setFields =
      [&](const std::optional<DWARF5AccelTable::UnitIndexAndEncoding> &Entry)
      -> void {
    if (!Entry)
      return;
    switch (Entry->Encoding.Index) {
    case dwarf::DW_IDX_compile_unit:
      AbbrvDesc.Bits.CompUnit = true;
      break;
    case dwarf::DW_IDX_type_unit:
      AbbrvDesc.Bits.TypeUnit = true;
      break;
    case dwarf::DW_IDX_parent:
      AbbrvDesc.Bits.Parent = true;
      break;
    case dwarf::DW_IDX_type_hash:
      AbbrvDesc.Bits.TypeHash = true;
      break;
    default:
      return;
    }
  };
  setFields(EntryRet);
  setFields(SecondEntryRet);
  AbbrvDesc.Bits.DieOffset = true;
  AbbrvDesc.Bits.Tag = Tag;
  AbbrevTagToIndexMap.insert({AbbrvDesc.Value, (uint32_t)(AbbrevTagToIndexMap.size() + 1)});
  return AbbrvDesc.Value;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Combining the two. Something like:

Something like that would suffice - though I wouldn't mind if the abbrev system for debug_info DIEs/abbrevs was generalized to be usable for debug_names entries too, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  uint32_t Parent : 1;

We need two bits for the parent information, as it can have 3 states: absent, form_ref4, form_flag_present

though I wouldn't mind if the abbrev system for debug_info DIEs/abbrevs was generalized to be usable for debug_names entries too

I remember looking a bit into how that works, but it would require a bit more effort, since debug_info uses attributes whereas debug_names uses IDXs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  uint32_t Parent : 1;

We need two bits for the parent information, as it can have 3 states: absent, form_ref4, form_flag_present

though I wouldn't mind if the abbrev system for debug_info DIEs/abbrevs was generalized to be usable for debug_names entries too

I remember looking a bit into how that works, but it would require a bit more effort, since debug_info uses attributes whereas debug_names uses IDXs

ah thanks. It was a place holder as initial implementation in bolt won't support it. It will be follow up PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember looking a bit into how that works, but it would require a bit more effort, since debug_info uses attributes whereas debug_names uses IDXs

Yep, I'd expect it to be generalized into a template parameterized on the type of the attributes (llvm::dwarf::Tag and llvm::dwarf::Index)

ayermolo added a commit to ayermolo/llvm-project that referenced this pull request Feb 1, 2024
This stemps from conversatin in: llvm#77457 (comment).
Right now Abbrev code for abbrev is combination of DIE TAG and other attributes.
In the future it will be changed to be an index. Since DenseSet does not
preserve an order, added a sort based on abbrev code. Once change to index is
made, it will print out abbrevs in the order they are stored.
ayermolo added a commit to ayermolo/llvm-project that referenced this pull request Feb 1, 2024
This stemps from conversatin in: llvm#77457 (comment).
Right now Abbrev code for abbrev is combination of DIE TAG and other attributes.
In the future it will be changed to be an index. Since DenseSet does not
preserve an order, added a sort based on abbrev code. Once change to index is
made, it will print out abbrevs in the order they are stored.
ayermolo added a commit that referenced this pull request Feb 2, 2024
This stemps from conversatin in:
#77457 (comment).
Right now Abbrev code for abbrev is combination of DIE TAG and other
attributes.
In the future it will be changed to be an index. Since DenseSet does not
preserve an order, added a sort based on abbrev code. Once change to
index is
made, it will print out abbrevs in the order they are stored.
felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request Feb 2, 2024
This implements the ideas discussed in [1].

To summarize, this commit changes AsmPrinter so that it outputs
DW_IDX_parent information for debug_name entries. It will enable
debuggers to speed up queries for fully qualified types (based on a
DWARFDeclContext) significantly, as debuggers will no longer need to
parse the entire CU in order to inspect the parent chain of a DIE.
Instead, a debugger can simply take the parent DIE offset from the
accelerator table and peek at its name in the debug_info/debug_str
sections.

The implementation uses two types of DW_FORM for the DW_IDX_parent
attribute:

1. DW_FORM_ref4, which points to the accelerator table entry for the
parent.
2. DW_FORM_flag_present, when the entry has a parent that is not in the
table (that is, the parent doesn't have a name, or isn't allowed to be
in the table as per the DWARF spec). This is space-efficient, since it
takes 0 bytes.

The implementation works by:

1. Changing how abbreviations are encoded (so that they encode which
form, if
any, was used to encode IDX_Parent)
2. Creating an MCLabel per accelerator table entry, so that they may be
referred by IDX_parent references.

When all patches related to this are merged, we are able to show that
evaluating an expression such as:

```
lldb --batch -o 'b CodeGenFunction::GenerateCode' -o run -o 'expr Fn' -- \
  clang++ -c -g test.cpp -o /dev/null
```

is far faster: from ~5000 ms to ~1500ms.

Building llvm-project + clang with and without this patch, and looking
at its impact on object file size:

```
ls -la $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,507,327,592

-la $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,436,446,616
```

That is, an increase of 0.62% in total object file size.

Looking only at debug_names:

```
$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
440,772,348

$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
369,867,920
```

That is an increase of 19%.

DWARF Linkers need to be changed in order to support this. This commit
already brings support to "base" linker, but it does not attempt to
modify the parallel linker. Accelerator entries refer to the
corresponding DIE offset, and this patch also requires the parent DIE
offset -- it's not clear how the parallel linker can access this. It may
be obvious to someone familiar with it, but it would be nice to get help
from its authors.

[1]:
https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/

(cherry picked from commit b667783)
agozillon pushed a commit to agozillon/llvm-project that referenced this pull request Feb 5, 2024
)

This stemps from conversatin in:
llvm#77457 (comment).
Right now Abbrev code for abbrev is combination of DIE TAG and other
attributes.
In the future it will be changed to be an index. Since DenseSet does not
preserve an order, added a sort based on abbrev code. Once change to
index is
made, it will print out abbrevs in the order they are stored.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants